ParaMor: Finding Paradigms across Morphology1

نویسندگان

  • Christian Monson
  • Jaime Carbonell
  • Alon Lavie
  • Lori Levin
چکیده

ParaMor automatically learns morphological paradigms from unlabelled text, and uses them to annotate word forms with morpheme boundaries. ParaMor competed in the English and German tracks of Morpho Challenge 2007 (Kurimo et al., 2008). In English, ParaMor’s balanced precision and recall outperform at F1 an already sophisticated baseline induction algorithm, Morfessor (Creutz, 2006). In German, ParaMor suffers from a low morpheme recall. But combining ParaMor’s analyses with analyses from Morfessor results in a set of analyses that outperform either algorithm alone, and that place first in F1 among all algorithms submitted to Morpho Challenge 2007. Categories and Subject Descriptions: I.2 [Artificial Intelligence]: I.2.7 Natural Language Processing

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ParaMor: Finding Paradigms across Morphology

Our algorithm, ParaMor, fared well in Morpho Challenge 2007 (Kurimo et al., 2007), a peer operated competition pitting against one another algorithms designed to discover the morphological structure of natural languages from nothing more than raw text. ParaMor constructs sets of affixes closely mimicking the paradigms of a language, and, with these structures in hand, annotates word forms with ...

متن کامل

Resource-Light Acquisition of Inflectional Paradigms

This paper presents a resource-light acquisition of morphological paradigms and lexicon for fusional languages. It builds upon Paramor [10], an unsupervised system, by extending it: (1) to accept a small seed of manually provided word inflections with marked morpheme boundary; (2) to handle basic allomorphic changes acquiring the rules from the seed and/or from previously acquired paradigms. Th...

متن کامل

ParaMor: Minimally Supervised Induction of Paradigm Structure and Morphological Analysis

Paradigms provide an inherent organizational structure to natural language morphology. ParaMor, our minimally supervised morphology induction algorithm, retrusses the word forms of raw text corpora back onto their paradigmatic skeletons; performing on par with state-ofthe-art minimally supervised morphology induction algorithms at morphological analysis of English and German. ParaMor consists o...

متن کامل

Probabilistic ParaMor

The ParaMor algorithm for unsupervised morphology induction, which competed in the 2007 and 2008 Morpho Challenge competitions, does not assign a numeric score to its segmentation decisions. Scoring each character boundary in each word with the likelihood that it falls at a true morpheme boundary would allow ParaMor to adjust the confidence level at which the algorithm proposes segmentations. A...

متن کامل

Evaluating an Agglutinative Segmentation Model for ParaMor

This paper describes and evaluates a modification to the segmentation model used in the unsupervised morphology induction system, ParaMor. Our improved segmentation model permits multiple morpheme boundaries in a single word. To prepare ParaMor to effectively apply the new agglutinative segmentation model, two heuristics improve ParaMor’s precision. These precision-enhancing heuristics are adap...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007